Abstract

Analysis of degree-degree dependencies in complex networks, and of their impact on processes
on networks, requires null models, i.e. models that generate uncorrelated scale-free networks.
Most models to date, however, show structural negative dependencies caused by finite-size
effects. We analyze the behavior of these structural negative degree-degree dependencies, using
rank-based correlation measures, in the directed Erased Configuration Model. We obtain
expressions for the scaling as a function of the exponents of the distributions. Moreover, we
show that this scaling undergoes a phase transition, where one region exhibits scaling related
to the natural cut-off of the network while another region has scaling similar to the structural
cut-off for uncorrelated networks. By establishing the speed of convergence of these structural
dependencies we are able to assess the statistical significance of degree-degree dependencies
on finite complex networks when compared to networks generated by the directed Erased
Configuration Model.


1 Introduction 

The tendency of nodes in a network to be connected to nodes of similar degree, large or small,
called network assortativity, degree mixing or degree-degree dependency, is an important
characterization of the topology of the network, influencing many processes on the network. It has
received significant attention in the literature, for instance in the field of network stability [31],
attacks on P2P networks [?] and epidemics [?].

An important method to analyze these degree-degree dependencies, or their influence on other
network properties or processes on the network, is to compare results to an average over several
instances of similar networks with neutral mixing. These null models often come in two flavors.
The first approach is to sample from graphs with the same degree sequence but neutral mixing. A
widely accepted methodology for such sampling is the local rewiring model [19], which
takes the original network and randomly swaps edges until a randomized version is attained.
The disadvantage of these methods is that they have no theoretical performance guarantees.
The second approach is to generate a random graph with neutral mixing, which preserves basic
features, such as the degree distribution. A well-known model of this type is the Configuration
Model (CM) [?]. Here the degrees of vertices are drawn independently from the given
distribution, under the restriction that the total sum of degrees is even. Then the stubs are paired
uniformly at random to form edges. If we want to obtain a simple graph in this way, we can either
rewire until a simple graph is generated (Repeated Configuration Model), or remove the excess
edges and self-loops (Erased Configuration Model).

We note that there are many other methods that generate simple random graphs and have
theoretically established performance guarantees. For example, sequential algorithms based on
the properties of graphical sequences were proposed for undirected networks [?] and directed
Wikipedia   $N$         $N^{1/2}$   $\gamma_+$   $\gamma_-$   $\max D^+$   $\max D^-$
DE          1,532,978   1,238       1.80         1.05         5,032        118,064
EN          4,212,493   2,052       2.14         1.20         8,104        432,629
IT          1,017,953   1,009       1.96         1.05         5,212        91,588
NL          1,144,615   1,070       1.82         1.10         10,175       102,450
PL          949,153     974         1.90         1.04         4,100        112,537

Table 1: Basic degree characteristics of Wikipedia networks. The exponents of the degree
distributions are estimated using the implementation of the techniques from [?] by Peter Bloem,
http://github.com/Data2Semantics/powerlaws.


networks [?]. Another example is a grand-canonical model in [26] that generates a graph with
given average degrees using a maximum-entropy method. However, to the best of our knowledge,
none of these methods has an efficient implementation. Even the complexity $O(NE)$ in [?],
where $N$ is the network size and $E$ the number of edges, is arguably not feasible for truly large
networks, such as Wikipedia or Twitter.

Although neutral mixing is expected for both local rewiring and the Configuration Model, since
there is no preference in connecting two vertices, negative correlations are observed [?]
for scale-free networks with infinite variance of the degrees, i.e. where the degree distribution satisfies

    $$P(k) \sim k^{-(\gamma+1)}, \qquad 1 < \gamma < 2. \qquad (1)$$

In [20] this phenomenon is explained by observing that if one allows at most one edge between
two vertices, nodes with large degree must connect to nodes of small degree because there are
simply not enough distinct large nodes to connect to. A similar explanation is given in [?]. Here,
however, this is related to the difference in scaling between the natural and structural cut-off
of the network. The former is defined [?] as the degree value $k_c$ of which, on average, only one
instance is observed:

    $$N \int_{k_c}^{\infty} P(k)\,dk \sim 1. \qquad (2)$$

The structural cut-off is defined as the value $k_s$ for which the ratio between the average number
of edges that connect any two vertices of degree $k_s$, and the maximum possible number of such
edges in a simple graph, is 1. For networks with degree distribution (1) it follows from (2) that
the natural cut-off scales as $N^{1/\gamma}$, while the structural cut-off for uncorrelated networks
scales as $N^{1/2}$, see [?]. Therefore, when $\gamma < 2$, the structural cut-off grows more slowly
than the natural cut-off, which in turn gives rise to structural negative correlations.
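
As a short worked step, the scaling of the natural cut-off can be read off directly from the power-law assumption and the definition of the cut-off. The sketch below assumes a pure power-law tail $P(k) = C k^{-(\gamma+1)}$ for large $k$; the constant $C$ is introduced here only for illustration and does not appear in the original text.

```latex
N \int_{k_c}^{\infty} C\,k^{-(\gamma+1)}\,dk
  \;=\; \frac{C N}{\gamma}\,k_c^{-\gamma} \;\sim\; 1
\quad\Longrightarrow\quad
k_c \;\sim\; N^{1/\gamma}.
```

Since $1/\gamma > 1/2$ whenever $\gamma < 2$, the largest degrees in this regime grow faster than $N^{1/2}$.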

To remedy these finite-size effects the authors of [?] propose an Uncorrelated Configuration
Model. This model follows the same procedure as the regular Configuration Model, with the
addition that the sampled degrees are bounded, $m \le k_i \le N^{1/2}$. Experiments in [?] indeed show
that these networks are uncorrelated. However, many scale-free networks, for instance Twitter,
have nodes whose degree is of larger order than $N^{1/2}$, which is a characteristic property of
scale-free graphs. For example, Table 1 displays the characteristics of Wikipedia networks for different
languages. Here we see that the maximum out-degree could be considered to be of order $N^{1/2}$,
while the maximum in-degree is definitely of a much larger scale. Therefore, randomized versions
of these networks, generated by the Uncorrelated Configuration Model, do not have the same basic
degree characteristics as the original network, since the maximum degree is restricted. Hence, they
are less suitable for comparison of the degree-degree dependencies.
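
The comparison with Table 1 can be made concrete with a few lines of Python. The sketch below only re-uses the numbers listed in Table 1; the dictionary layout and the helper name `ratios_to_sqrt_n` are illustrative choices and not part of the original analysis.

```python
import math

# (N, max out-degree, max in-degree) per Wikipedia language edition, from Table 1.
WIKIPEDIA = {
    "DE": (1_532_978, 5_032, 118_064),
    "EN": (4_212_493, 8_104, 432_629),
    "IT": (1_017_953, 5_212, 91_588),
    "NL": (1_144_615, 10_175, 102_450),
    "PL": (949_153, 4_100, 112_537),
}


def ratios_to_sqrt_n(n, max_out, max_in):
    """Express the maximum out- and in-degree in units of N**(1/2)."""
    root = math.sqrt(n)
    return max_out / root, max_in / root


for lang, row in WIKIPEDIA.items():
    out_ratio, in_ratio = ratios_to_sqrt_n(*row)
    print(f"{lang}: max D+ / sqrt(N) = {out_ratio:6.1f}   max D- / sqrt(N) = {in_ratio:6.1f}")
```

For every edition in the table the maximum out-degree stays within a factor of ten of $N^{1/2}$, whereas the maximum in-degree exceeds $N^{1/2}$ by roughly two orders of magnitude.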

In this paper we consider the directed Erased Configuration Model (ECM) [?], where after the
pairing self-loops are removed and multiple edges are merged. In our recent work [55], Section 5,
we showed that this model has neutral mixing in the infinite network size limit. The idea behind
this result is that the total average number of erased edges per node, which defines the difference
in the correlations between the CM and the ECM, goes to zero when the size of the network
grows. By this result, from a purely mathematical point of view the ECM is a null model for
Figure 1: The four different degree-degree dependency types in directed networks.
Figure 2: Plots of the empirical cumulative distribution of $\rho_+^-$, $\bar{\rho}_+^-$ and $\tau_+^-$ for ECM graphs of
different sizes with $\gamma_\pm = 1.2$. Each plot is based on 10^ realizations of the model.


degree-degree dependencies in the limit. Moreover, asymptotically, the degree distributions are 
preserved and hence, all basic degree characteristics. Still, for finite sizes, structural dependencies 
are present. 

Rather than trying to control these correlations, our goal is to evaluate their magnitude and 
investigate their size dependence. We obtain the scaling for the structural correlations in the 
ECM, in terms of the power law exponents of the in- and out-degrees. In particular, we show that 
this scaling undergoes an interesting phase transition, and can be dominated by terms related to 
either the structural or the natural cut-off of the network. To the best of our knowledge, this 
is the first study that provides a systematic mathematical characterization for the magnitude of 
negative correlations in a simple graph with neutral mixing. 

By determining the scaling of the structural correlations we can assess the significance of measured
correlations, as well as their influence on network processes, on real-world networks of finite
size, by comparing them to the directed Erased Configuration Model. This approach has the
advantage of preserving the degree characteristics of the original network; it can be easily
implemented and applied to all networks with scale-free degree distributions and finite expectation.

2 Degree-degree dependencies in random directed networks 

We analyze degree-degree dependencies in random directed networks of size $N$, where the
distributions of the out- and in-degree $(D^+, D^-)$ follow, respectively,

    $$p_+(k) \sim k^{-(\gamma_+ + 1)} \quad\text{and}\quad p_-(\ell) \sim \ell^{-(\gamma_- + 1)}, \qquad \gamma_\pm > 1. \qquad (3)$$

In directed networks one can consider four types of degree-degree dependencies, depending on
the choice of the degree type on both sides of an edge, see Figure 1. For the remainder of this
paper we denote by $E$ the number of edges and adopt the notation style from [?] to index
the degree types by $\alpha, \beta \in \{+, -\}$.
Figure 3: Plots of the out- and in-degree distribution, on log-log scale, for a graph generated by
the ECM, of size 10® with $\gamma_+ = 1.9$ and $\gamma_- = 1.2$, before (CM) and after (ECM) the removal of
edges.


A common measure for degree-degree dependencies, introduced in [22], computes Pearson’s
correlation coefficient on the joint data $(D_i^\alpha, D_j^\beta)_{i \to j}$, where the indices run over all $i, j$ for which
there is an edge $i \to j$.

However, Pearson’s correlation coefficients are unable to measure strong negative degree-degree
dependencies in large networks where the variance of the degrees is infinite, as was shown for
undirected networks in [?] and for directed networks in [30]. Since our interest is mainly in
networks in the infinite variance domain, i.e. $1 < \gamma_\pm < 2$, we need different measures. In [30] it
was suggested to use rank correlations, related to Spearman’s rho and Kendall’s tau [?], to
measure degree-degree dependencies.

Spearman’s rho computes Pearson’s correlation coefficient on the ranks of $(D_i^\alpha, D_j^\beta)_{i \to j}$ rather
than on their actual values. Since this data will contain many ties, one needs to use ranking schemes
that deal with these ties. In [?] two such schemes are considered, resolving ties at random and
assigning an average rank to tied values, which give two correlation measures denoted by $\rho_\alpha^\beta$
and $\bar{\rho}_\alpha^\beta$, respectively. Here, the subscript index denotes the degree type of the source, while the
superscript index denotes the degree type of the target of a directed edge. For instance, $\rho_+^-$ denotes
Spearman’s rho for the Out-In dependency. The second rank correlation measure, Kendall’s tau
$\tau_\alpha^\beta$, calculates the normalized number of swaps needed to match the ranks of the joint data.

Exact formulas for these three measures, in terms of the degrees, are given in [?]. In [?]
formulas are given in terms of the empirical distributions of $D^\alpha$ and $D^\beta$ and their joint distribution,
evaluated at $(D_i^\alpha, D_j^\beta)$ for an edge $i \to j$ selected uniformly at random. From these it follows that
if the network has neutral mixing, then $\rho_\alpha^\beta$ and $\tau_\alpha^\beta$ are similar, while $\rho_\alpha^\beta$ and $\bar{\rho}_\alpha^\beta$ differ by a term
of $O(1)$, which does not influence the scaling. To illustrate this we plotted the empirical cdfs of
$\rho_+^-$, $\bar{\rho}_+^-$ and $\tau_+^-$ for a collection of ECM graphs in Figure 2, where we clearly observe the similar
behavior of the three measures. Therefore, for the analysis of degree-degree dependencies, we
will only consider $\rho_\alpha^\beta$, which corresponds to Spearman’s rho where ties are resolved uniformly at
random.
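
The measure used in the remainder of the paper can be sketched in a few lines of NumPy. The code below is a minimal illustration of Spearman’s rho with ties resolved uniformly at random, not the authors’ implementation; the function names `random_tie_ranks` and `spearman_rho` and the edge-list layout (an array of `(source, target)` pairs) are assumptions made for this example.

```python
import numpy as np


def random_tie_ranks(values, rng):
    """Rank values 1..n, breaking ties uniformly at random."""
    values = np.asarray(values)
    # np.lexsort treats the LAST key as primary: sort by value, then by noise.
    order = np.lexsort((rng.random(len(values)), values))
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks


def spearman_rho(edges, source_degree, target_degree, rng):
    """Spearman's rho of (source_degree[i], target_degree[j]) over all edges i -> j."""
    edges = np.asarray(edges)
    rank_src = random_tie_ranks(source_degree[edges[:, 0]], rng)
    rank_tgt = random_tie_ranks(target_degree[edges[:, 1]], rng)
    # Pearson's correlation coefficient of the two rank vectors.
    return np.corrcoef(rank_src, rank_tgt)[0, 1]


# Usage on a toy directed graph with 4 nodes: the Out-In dependency pairs the
# out-degree of the source with the in-degree of the target of every edge.
edges = np.array([[0, 1], [0, 2], [0, 3], [1, 2], [2, 3]])
d_out = np.bincount(edges[:, 0], minlength=4)
d_in = np.bincount(edges[:, 1], minlength=4)
rho_out_in = spearman_rho(edges, d_out, d_in, np.random.default_rng(1))
```

The other three dependency types follow by passing the appropriate degree arrays, for instance `spearman_rho(edges, d_in, d_out, rng)` for the In-Out dependency.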
Figure 4: Plots of the empirical cumulative distribution of $\rho_\alpha^\beta$ for all four degree-degree dependency
types for ECM graphs of different sizes with $\gamma_\pm = 2.1$. Each plot is based on 10^ realizations of
the model.


3 The directed Erased Configuration Model 

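The directed Configuration Model (CM) starts with degree sequences $\{D_i^+, D_i^-\}_{1 \le i \le N}$ that satisfy,
for some $\mu > 0$,

    $$\sum_{i=1}^N D_i^+ = \sum_{i=1}^N D_i^- \sim \mu N, \qquad
      \sum_{i=1}^N D_i^+ D_i^- \sim \mu^2 N, \qquad
      \sum_{i=1}^N (D_i^\pm)^p \sim N^{p/\gamma_\pm}, \quad p > \gamma_\pm. \qquad (4)$$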

The stubs are then paired at random to form edges. This will in general constitute a graph with 
self-loops and multiple edges between nodes. If the degree variance is finite, then the probability 
of generating a simple graph is bounded away from zero and thus, by repeating the pairing step 
until such a graph is generated, we get a network randomly sampled from all networks of given 
size and degree sequences. This is called the Repeated Configuration Model (RCM). 

When the variance of the degrees is infinite, the probability of generating a simple graph 
converges to zero as the graph size increases, and therefore we need to enforce that the resulting 
graph is simple. For this we use the Erased Configuration Model (ECM), where, during the 
pairing, a new edge is removed if it already exists or if it is a self loop. Although this seems to 
be a strong alteration of the initial degree sequence, asymptotically, the degrees of the resulting 
Figure 5: Plots of the empirical cumulative distribution of $\rho_\alpha^\beta$ for all four degree-degree dependency
types for ECM graphs of different sizes with $\gamma_\pm = 1.2$. Each plot is based on 10^ realizations of
the model.


network still follow the same distribution, see [9]. For illustration, in Figure 3 we plotted the
degree distributions of an ECM graph of size 10® before and after the removal of edges. Clearly
there is hardly any difference between the two distributions. In particular, the degree sequences
of ECM graphs still satisfy (4). Unlike many other methods, random pairing of the stubs can be
implemented very efficiently, even for billions of nodes. Moreover, the ECM is computationally
less expensive than the RCM, since we do not need to repeat the pairing. Therefore we suggest
using the ECM as a standard null model. In the rest of the paper we will characterize the structural
dependencies in the ECM.
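
The construction described in this section can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors’ code: degrees are drawn as discretised Pareto variables, and a mismatch between the number of out- and in-stubs is repaired by adding single stubs to randomly chosen nodes, a step the text does not specify. The function names are hypothetical.

```python
import numpy as np


def sample_degree_sequences(n, gamma_out, gamma_in, rng):
    """Draw out/in-degrees with tails P(D > k) ~ k**(-gamma) and equal sums."""
    # rng.pareto draws Lomax variables X >= 0; floor(X + 1) is an integer >= 1.
    d_out = np.floor(rng.pareto(gamma_out, n) + 1.0).astype(np.int64)
    d_in = np.floor(rng.pareto(gamma_in, n) + 1.0).astype(np.int64)
    # Assumption: balance the stub counts by adding stubs at random nodes.
    diff = int(d_out.sum() - d_in.sum())
    short = d_in if diff > 0 else d_out
    np.add.at(short, rng.integers(0, n, abs(diff)), 1)
    return d_out, d_in


def erased_configuration_model(d_out, d_in, rng):
    """Pair stubs uniformly at random, then erase self-loops and multi-edges."""
    out_stubs = np.repeat(np.arange(len(d_out)), d_out)
    in_stubs = np.repeat(np.arange(len(d_in)), d_in)
    rng.shuffle(in_stubs)                        # uniform random pairing
    edges = np.column_stack((out_stubs, in_stubs))
    edges = edges[edges[:, 0] != edges[:, 1]]    # remove self-loops
    return np.unique(edges, axis=0)              # merge multiple edges


rng = np.random.default_rng(0)
d_out, d_in = sample_degree_sequences(1000, 1.9, 1.2, rng)
edges = erased_configuration_model(d_out, d_in, rng)
erased_fraction = 1.0 - len(edges) / d_out.sum()
```

Because the pairing is a single shuffle and the erasure is a filter followed by a deduplication, the cost is dominated by sorting the edge list, which is what makes the approach practical for large networks. The degrees of the resulting simple graph should be recomputed from `edges`, since they can be smaller than the sampled ones.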

4 Degree-degree dependencies in the ECM 

It is clear that when we use the CM, i.e. allow for multiple edges and self-loops, our graphs will
have neutral mixing since all stubs are connected completely at random. For the ECM, however,
we remove edges to make the graph simple, which has been shown [?] to give rise to negative
correlations. Nevertheless, the ECM has asymptotically neutral mixing, which can be shown as
follows.

Let $E_{ij}$ be the matrix counting the number of edges between $i$ and $j$ after the pairing and let
$E^c_{ij}$ denote the matrix counting the number of edges between $i$ and $j$ removed by the ECM. Then
for the CM it holds that $D_i^+ = \sum_{j=1}^N E_{ij}$, while for the ECM we have $D_i^+ = \sum_{j=1}^N (E_{ij} - E^c_{ij})$.
Therefore, the difference between the empirical distributions of $D_i^\alpha$ and $D_j^\beta$, for an edge $i \to j$
sampled at random, in the CM and ECM, will be of the order $\frac{1}{N}\sum_{i,j=1}^N E^c_{ij}$, whose average, with
respect to the degree sequences, converges to zero [?],
$N$        $\langle\rho_+^-\rangle$   $\langle\rho_-^+\rangle$   $\langle\rho_+^+\rangle$   $\langle\rho_-^-\rangle$
10000      -0.1568    -0.0001    0.0039    0.0048
50000      -0.1439     0.0001    0.0014    0.0029
100000     -0.1388    -0.0001    0.0026    0.0028
500000     -0.1198     0.0001    0.0011    0.0017
1000000    -0.1131     0.0000    0.0009    0.0002

Table 2: The average values of $\rho_\alpha^\beta$ for all four degree-degree dependency types, for ECM graphs
of different sizes, with $\gamma_\pm = 1.2$, based on 10^ realizations of the model.


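    $$\lim_{N \to \infty} \frac{1}{N} \sum_{i,j=1}^N \langle E^c_{ij} \rangle = 0. \qquad (5)$$

This implies that the values of $\rho_\alpha^\beta$ for an ECM graph will converge to those of a CM graph; hence,
asymptotically, $\rho_\alpha^\beta = 0$ and also $\bar{\rho}_\alpha^\beta = 0 = \tau_\alpha^\beta$ for the ECM.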

However, for finite realizations in the infinite variance regime, negative correlations are still
observed. To illustrate this we plotted the empirical cumulative distribution functions of $\rho_\alpha^\beta$ for
graphs generated by the ECM with both finite and infinite degree variance, see Figure 4 and
Figure 5, respectively. In addition, Table 2 contains the average values for all four correlation
types in the infinite variance regime. One immediately observes that the Out-In dependency in
ECM graphs with infinite variance, Figure 5a, displays strong structural negative correlations
which decrease as the network grows, while for the other three dependency types the values are
concentrated around zero. Moreover, we see in Figure 4 that all four dependency types behave
similarly when the variance of the degrees is finite.

These negative Out-In correlations ($\rho_+^-$) can be explained by first observing that multiple edges
are more likely to start in a node of large out-degree and end in a node of large in-degree, since
these are more likely to be sampled. Now, consider the algorithm as first connecting all stubs at
random and then removing self-loops and merging multiple edges. By construction, immediately
after the pairing the network will have neutral mixing. When merging multiple edges we will often
delete connections from nodes of large out-degree to nodes of large in-degree. Such edges
contributed positively to $\rho_+^-$; thus, deleting them will shift $\rho_+^-$ from zero in the CM to a negative
value in the ECM. The other three dependency types are not affected since the out- and in-degree
of a node in the ECM are independent.

Motivated by the analysis in this section, we will further focus on the behavior of $\rho_+^-$ in the
infinite-variance case, $1 < \gamma_+, \gamma_- < 2$, as the only scenario where we observe prominent structural
correlations. We will discuss other scenarios in Section 6.

5 Scaling of the Out-In degree-degree dependency in the ECM

We will determine the scaling of $\rho_+^-$ as a function of the exponents $\gamma_\pm$. That is, we will find
coefficients $f(\gamma_+, \gamma_-)$ such that

    $$\frac{\rho_+^- - \langle \rho_+^- \rangle}{N^{f(\gamma_+, \gamma_-)}}$$

converges to some limiting distribution. Here the expectation $\langle \rho_+^- \rangle$ is taken over all possible graphs
of size $N$, generated by the ECM, with degree sequences satisfying (4). We note that although
$\langle \rho_+^- \rangle$ is of similar order as the typical spread of $\rho_+^-$, the latter, which we are going to evaluate,
defines the magnitude of the structural negative correlations.

We obtain the scaling exponents $f(\gamma_+, \gamma_-)$ by establishing upper bounds on the scaling, and
then show empirically that these bounds are tight. The scaling is an important quantity,
characterizing the spread around the sample mean of $\rho_+^-$ as a function of $N$. Roughly, this tells us how
much the measured values on an ECM graph of size $N$ can deviate from the average and therefore
enables us to assess the significance of the measured correlations of the corresponding real-world
networks.


5.1 Scaling of the erased number of edges 

As we discussed in the previous section, the structural negative correlations appear after multiple
edges and self-loops are erased. Hence, part of the scaling of $\rho_+^-$ comes from the scaling of the
average total number of erased edges. The latter scaling has a phase transition, which we will
show by establishing two different upper bounds.

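For the first upper bound, observe that

    $$\sum_{i,j=1}^N E^c_{ij} = \sum_{i=1}^N S_{ii} + \sum_{i,j=1}^N M_{ij}, \qquad (6)$$

where $S$ is the diagonal matrix counting the number of self-loops and $M$ is the zero-diagonal
matrix that counts the excess edges, so $M_{ij} = k > 0$ means that $E_{ij} = k + 1$. For the self-loops it
holds that

    $$\langle S_{ii} \rangle = \frac{D_i^+ D_i^-}{E}. \qquad (7)$$

If we now take the total number of pairs of edges between $i$ and $j$ as an upper bound for $M_{ij}$,
then

    $$\langle M_{ij} \rangle \le \frac{(D_i^+)^2 (D_j^-)^2}{E^2}. \qquad (8)$$

Applying (7) and (8) to (6) we get

    $$\frac{1}{N} \sum_{i,j=1}^N \langle E^c_{ij} \rangle \le \frac{1}{N} \sum_{i=1}^N \frac{D_i^+ D_i^-}{E}
      + \frac{1}{N} \sum_{i,j=1}^N \frac{(D_i^+)^2 (D_j^-)^2}{E^2}. \qquad (9)$$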


We remark that if the second moment of both the out- and in-degree exists, then this upper bound
scales as $N^{-1}$. When this is not the case, we get the scaling from (4) as

    $$\frac{1}{N} \sum_{i,j=1}^N \langle E^c_{ij} \rangle = O\left(N^{(2/\gamma_+) + (2/\gamma_-) - 3}\right). \qquad (10)$$

The upper bound (10) is rather crude in the sense that for certain $1 < \gamma_\pm < 2$ we have
$(2/\gamma_+) + (2/\gamma_-) > 3$, so that the right-hand side of (10) becomes infinite as $N \to \infty$.

To get a more precise upper bound let $p(n, m, L)$ denote the probability that none of the
outbound stubs from a set of size $n$ connect to an inbound stub from a set of size $m$, given that
the total number of available stubs is $L$. We will establish a recursive relation for $p(D_i^+, D_j^-, E)$
by adopting the analysis from [28], Section 4. Similarly we get, by conditioning on whether we
pick an inbound stub of $j$ or not,

    $$p(D_i^+, D_j^-, E) \le \left(1 - \frac{D_j^-}{E}\right) p(D_i^+ - 1, D_j^-, E - 1),$$

where the upper bound comes from neglecting the event $D_i^+ + D_j^- > E$, in which case
$p(D_i^+, D_j^-, E) = 0$. Continuing the recursion yields

    $$p(D_i^+, D_j^-, E) \le \prod_{k=0}^{D_i^+ - 1} \left(1 - \frac{D_j^-}{E - k}\right),$$
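and a first order Taylor expansion then gives

    $$p(D_i^+, D_j^-, E) \le e^{-D_i^+ D_j^- / E}. \qquad (11)$$

Now, recall that $E_{ij}$ denotes the total number of edges between $i$ and $j$ in the CM, before the
removal step. Therefore,

    $$\langle E^c_{ij} \rangle = \langle E_{ij} \rangle - \left(1 - p(D_i^+, D_j^-, E)\right).$$

Since $\sum_{i,j=1}^N \langle E_{ij} \rangle = E$ it follows that

    $$\frac{1}{E} \sum_{i,j=1}^N \langle E^c_{ij} \rangle = 1 - \frac{N^2}{E} + \frac{1}{E} \sum_{i,j=1}^N p(D_i^+, D_j^-, E). \qquad (12)$$

Hence, by plugging (11) into (12) we arrive at the following upper bound for the total average
number of erased edges,

    $$\frac{1}{E} \sum_{i,j=1}^N \langle E^c_{ij} \rangle \le 1 - \frac{N^2}{E} + \frac{1}{E} \sum_{i,j=1}^N e^{-D_i^+ D_j^- / E}. \qquad (13)$$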

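The right-hand side of (13) can be slightly rewritten to obtain a more informative expression,
which is the product of $N^2/E$ and the term

    $$\frac{1}{E} \sum_{i,j=1}^N \frac{D_i^+ D_j^-}{N^2} - 1 + \sum_{i,j=1}^N \frac{e^{-D_i^+ D_j^- / E}}{N^2}. \qquad (14)$$

Next, we note that (14) can be seen as an empirical form of

    $$\left\langle \frac{\xi}{\mu N} - 1 + e^{-\xi / (\mu N)} \right\rangle, \qquad (15)$$

where, letting $\gamma_{\min} = \min\{\gamma_+, \gamma_-\}$, $\xi$ has distribution

    $$P_\xi(k) \sim k^{-(\gamma_{\min} + 1)}$$

and $\langle \xi \rangle = \mu^2$. From a classical Tauberian Theorem for regularly varying random variables, see for
instance [?], Theorem A, it follows that (15) scales as $N^{-\gamma_{\min}}$. When we replace $E$ by $\mu N$ in (14),
we obtain

    $$\frac{1}{\mu N} \sum_{i,j=1}^N \frac{D_i^+ D_j^-}{N^2} - 1 + \sum_{i,j=1}^N \frac{e^{-D_i^+ D_j^- / (\mu N)}}{N^2} \qquad (16)$$

and observe that (15) is the expectation of (16). The function $f(x) = x - 1 + e^{-x}$ is positive;
hence, it follows that (15) and (16) have the same scaling, $N^{-\gamma_{\min}}$. Finally, the difference between
(14) and (16) is dominated by the term

    $$\left| \frac{1}{E} - \frac{1}{\mu N} \right| = O\left(N^{-2} |E - \mu N|\right).$$

Recall that $\sum_{i=1}^N D_i^+ = E = \sum_{i=1}^N D_i^-$. Hence, we obtain from the Central Limit Theorem for
regularly varying random variables, see [32], that

    $$N^{-2} |E - \mu N| = O\left(N^{-2 + 1/\gamma_{\min}}\right),$$

which dominates $N^{-\gamma_{\min}}$ when $1 < \gamma_\pm < 2$. Summarizing, we have that (14) scales as
$O(N^{-2 + 1/\gamma_{\min}})$ and hence, since $N^2/E = O(N)$, it follows that

    $$\frac{1}{E} \sum_{i,j=1}^N \langle E^c_{ij} \rangle = O\left(N^{-1 + 1/\gamma_{\min}}\right). \qquad (17)$$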

bi=i 
Region   $f(\gamma_+, \gamma_-)$
A        $1/\gamma_{\min} - 1$
B        $(2/\gamma_+) + (2/\gamma_-) - 3$
C        $-1/2$

Table 3: The three scaling terms for $\rho_+^-$ for each of the three regions displayed in Figure 6.


The scaling in (17) is related to that of the structural cut-off described in [5], adjusted to the
setting of directed networks with degree distributions (3). Moreover, comparing (17) to (10) we
observe a phase transition, with respect to the tail exponents $\gamma_\pm$ of the degree distributions, in
the scaling of the average total number of removed edges in the ECM, which will induce a phase
transition in the scaling of the Out-In degree-degree dependency.



Figure 6: Plot of the different scaling regimes for $\rho_+^-$. The scaling terms for each of the three
regions can be found in Table 3. The Roman numerals indicate the three different choices of $\gamma_+$
and $\gamma_-$ used in Figures 7 and 8 to illustrate the different regimes.


5.2 Phase transitions for the Out-In degree-degree dependency 

First we remark that for the CM, the empirical distribution of the degrees on both sides of a
randomly sampled edge converges to the distribution of two independent random variables as
$N^{-1}$, see [?]. Because Spearman’s rho and Kendall’s tau on independent joint measurements
are normal statistics [?], the scaling of their average is $N^{-1/2}$. Hence $\rho_\alpha^\beta$ for CM graphs scales
as $N^{-1/2}$. Since an ECM graph is basically a CM graph where multiple edges are merged and
self-loops are removed, it follows that the distributions of the degrees on both sides of a randomly
chosen edge differ from those of the CM by terms of the order $\frac{1}{N}\sum_{i,j=1}^N \langle E^c_{ij} \rangle$. Therefore, the scaling
of $\rho_+^-$ is determined by the largest term out of $N^{-1/2}$ and the scaling of $\frac{1}{N}\sum_{i,j=1}^N \langle E^c_{ij} \rangle$. Since the
latter undergoes a phase transition, we actually have a three-stage phase transition for the scaling
of $\rho_+^-$ in the ECM. The first stage has scaling $N^{1/\gamma_{\min} - 1}$ and holds for all $\gamma_\pm$ for which

    $$\frac{1}{\gamma_{\min}} - 1 \le \frac{2}{\gamma_+} + \frac{2}{\gamma_-} - 3,$$

since both correspond to upper bounds, so that the smaller of the two applies. The next region,
$\gamma_\pm$ such that $2/\gamma_+ + 2/\gamma_- - 3 > -1/2$, has scaling $N^{(2/\gamma_+) + (2/\gamma_-) - 3}$. Outside this region
we have normal scaling, $N^{-1/2}$. The different
regions are displayed in Figure 6, while Table 3 shows the three scaling terms. We remark that
the phase transitions of the scaling are smooth since they are induced by inequalities on the terms.
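
The three regimes can be summarised in a small helper. The sketch below encodes one reading of the rule in this section, namely $f = \max\{-1/2, \min\{1/\gamma_{\min} - 1, (2/\gamma_+) + (2/\gamma_-) - 3\}\}$ on the infinite-variance domain $1 < \gamma_\pm < 2$; the function name `scaling_exponent` and the restriction to that domain are assumptions made for this example.

```python
def scaling_exponent(gamma_out, gamma_in):
    """Return (region, f) such that the spread of the Out-In correlation scales as N**f.

    Only the infinite-variance regime 1 < gamma < 2 for both degrees is handled.
    """
    if not (1.0 < gamma_out < 2.0 and 1.0 < gamma_in < 2.0):
        raise ValueError("expected 1 < gamma_out, gamma_in < 2")
    a = 1.0 / min(gamma_out, gamma_in) - 1.0     # region A term
    b = 2.0 / gamma_out + 2.0 / gamma_in - 3.0   # region B term
    c = -0.5                                     # region C: normal scaling
    if b <= c:
        return "C", c
    if a <= b:
        return "A", a
    return "B", b


# The three parameter choices marked I, II and III in Figure 6.
for label, (g_out, g_in) in {"I": (1.3, 1.3), "II": (1.9, 1.3), "III": (1.9, 1.5)}.items():
    region, f = scaling_exponent(g_out, g_in)
    print(f"point {label}: region {region}, f = {f:.4f}")
```

With these inputs, point I falls in region A, point II in region B and point III in region C, in line with the choices described for Figure 6.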
Figure 7: Plots of the empirical cumulative distribution function of $\rho_+^-$ using different scaling
and for different choices of $\gamma_\pm$. The left, center and right columns each use a different one of the
three scaling terms from Table 3. The first row is for ECM graphs with $\gamma_\pm = 1.3$,
the second for $\gamma_+ = 1.9$, $\gamma_- = 1.3$ and the third for $\gamma_+ = 1.9$, $\gamma_- = 1.5$, corresponding to points
I, II and III, respectively, in Figure 6.
Figure 8: Plots of the empirical cumulative distribution function of $\rho_+^+$ for choices of $\gamma_\pm$
corresponding to points I, II and III from Figure 6, using the corresponding scaling.
5.3 Simulations

In order to show the phase transitions we plotted the empirical cumulative distribution function
of $\rho_+^-$ for the specific choices of $\gamma_\pm$ corresponding to the points I, II and III in Figure 6. For each
of the three points we shifted the empirical data by its average and multiplied it by $N^{-f(\gamma_+, \gamma_-)}$
for each of the three coefficients from Table 3, corresponding to the different scaling areas A, B
and C. The results are shown in Figure 7. When the correct scaling is applied, the corresponding
cdf plots should almost completely overlap and resemble the cdf of some limiting distribution. We
observe that for each of the three choices I, II and III, this is the case when the corresponding
scaling from its area, respectively A, B and C, is chosen.
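
The rescaling step used for these plots can be sketched as follows. This is a minimal illustration, assuming that `samples` holds the values of the correlation measure over independent ECM realizations of size `n` and that `f` is one of the three exponents from Table 3; the helper names `rescale` and `ecdf` are introduced here for illustration only.

```python
import numpy as np


def rescale(samples, n, f):
    """Shift the samples by their mean and multiply by n**(-f)."""
    samples = np.asarray(samples, dtype=float)
    return (samples - samples.mean()) / n ** f


def ecdf(values):
    """Return the points (x, F(x)) of the empirical cumulative distribution function."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


# Usage: with the correct exponent f, the curves for different n should overlap.
x, y = ecdf(rescale([0.1, 0.3], n=100, f=-0.5))
```

Plotting `y` against `x` for several network sizes on the same axes gives the kind of comparison described above: curves that collapse onto each other indicate that the chosen exponent matches the actual scaling.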


6 Scaling of degree-degree dependencies for the other cases 



Figure 9: Plots of the empirical cumulative distribution function of $\rho_-^+$ for choices of $\gamma_\pm$
corresponding to points I, II and III from Figure 6, using square root scaling.

In the previous section we completely characterized the scaling behavior of $\rho_+^-$ for ECM graphs
with infinite variance of the degrees. Here, we first discuss the remaining correlation types, $\rho_+^+$, $\rho_-^+$
and $\rho_-^-$, in the infinite variance regime and lastly, we consider all four types in the finite variance
regime.

The intuition behind the structural negative Out-In dependencies was that multiple edges are
more likely to exist between nodes of large out- and in-degree. The other three types do not show
negative correlations, see Figures 5b-5d, which we argued was due to the fact that the in- and
out-degree of a node in the ECM are independent. Nevertheless, the spread of both the Out-Out
and In-In degree-degree dependency exhibits scaling with the same functions as the Out-In
dependency. This is illustrated in Figure 8, where we plotted the empirical cumulative distribution
of the Out-Out dependency for ECM graphs, for values of $\gamma_\pm$ corresponding to points I, II and
III from Figure 6, scaled by the correct term for each of these points. This is because $\rho_+^+$ again
depends on the number of erased edges, through the out-degree of their source nodes. However,
the out-degree of the target node of a removed edge can be either large or small, thus $\rho_+^+$ in the
ECM remains zero on average. By symmetry, the scaling for the In-In dependency is similar.

This non-trivial scaling is typical for the ECM. Recall that in the CM, $\rho_\alpha^\beta$ is a normal statistic
and scales as $N^{-1/2}$ for any $\alpha, \beta$ because all degrees are independent random variables. This is
exactly what we observe for the In-Out degree-degree dependency, which, in contrast to the other
three, is not biased towards removed edges. As we expect, here we have normal, square root,
scaling for ECM graphs for any choice of $\gamma_\pm$. This can clearly be observed in Figure 9, where we
plotted the empirical cumulative distributions of $\rho_-^+$ scaled by $\sqrt{N}$.
Figure 10: Plots of the empirical cumulative distribution of $\rho_\alpha^\beta$ for all four degree-degree
dependency types for ECM graphs with $\gamma_\pm = 2.1$ of different sizes, scaled by $\sqrt{N}$. Each plot is based
on 10^ realizations of the model.


For the degree-degree dependencies in the finite variance regime we plotted the empirical
cumulative distributions of $\rho_\alpha^\beta$ scaled by $\sqrt{N}$ in Figure 10. Since these are all completely
similar, we took the plot for one of them for an ECM graph of size 10® and compared it to a fitted normal
distribution with $\mu = 0$ and $\sigma^2 = 0.8$, see Figure 11. These plots strongly overlap, reinforcing the
claim that for ECM graphs with finite degree variance all four correlations are normal statistics.

7 Conclusion and Discussion 

In this paper we analyzed degree-degree dependencies in the directed Erased Configuration Model.
We showed (Figure 5) that in the infinite variance regime only the Out-In dependency exhibits
structural negative values, while all correlations behave similarly when both degrees have finite
variance (Figure 4). We investigated the scaling of the structural negative Out-In correlations.
These undergo a phase transition in terms of the exponents $\gamma_\pm$ of the degree distributions,
which we showed by establishing two upper bounds, (10) and (17), on the total average
number of removed edges, which scale at different rates. Combining this with the square root scaling
of Spearman’s rho and Kendall’s tau, we identified three regions, depending on $\gamma_\pm$, with different
scaling (Figure 6), and illustrated their phase transitions in Figure 7. Next, we considered the
remaining three dependency types for the infinite variance regime. We showed (Figure 8) that the
scaling of the Out-Out and In-In correlations behaves similarly to the Out-In, even though they
do not exhibit structural negative values, while the In-Out degree-degree dependency has square
root scaling (Figure 9). Finally we investigated the scaling for correlations when the degrees have
finite variance. In this case all four types have square root scaling and the plots of the cumulative
Figure 11: Plot of the empirical cumulative distribution function of $\rho_\alpha^\beta$ for ECM graphs of size
10® with $\gamma_\pm = 2.1$ and a normal cumulative distribution with $\mu = 0$ and $\sigma^2 = 0.8$.

distributions are very similar (Figure 10). This was confirmed when we compared the plot of $\rho_\alpha^\beta$
for ECM graphs of size 10®, with $\gamma_\pm = 2.1$, with that of a fitted normal distribution in Figure 11.

Our analysis shows that degree-degree dependencies in directed networks display non-trivial
behavior in terms of scaling when the degrees have infinite variance. This scaling is important
when doing statistical analysis of these measures or their impact on other processes on networks,
for it determines their spread and hence enables one to assess the significance of measurements.

We showed that degree-degree dependencies for degrees with finite variance, scaled by $\sqrt{N}$,
converge to a normal distribution with zero mean. We have not yet been able to determine the
variance of these distributions as a function of the tail exponents $\gamma_\pm$, which would completely
characterize their behavior.

For three of the four correlation types in the infinite variance regime, we did not determine the
limiting distributions. This is mainly due to the fact that we expect these to be stable distributions,
since one of the three scaling regions is due to the Central Limit Theorem for regularly varying
random variables, whose limits are stable distributions. Although these distributions have a
well-defined characteristic function, their density function, in general, does not have an analytical
expression. Moreover, we are dealing with discrete data and simulation of such distributions is
a field of its own. Nevertheless, we do expect that Central Limit Theorems for degree-degree
dependencies can be formulated and proven, which would fully complete their statistical analysis.

Finally, our empirical results clearly show the analytically derived phase transitions. However,
the region with the scaling $N^{(2/\gamma_+) + (2/\gamma_-) - 3}$ is less distinct than the other two. One of the possible
reasons for this is that within the area where this scaling applies, the difference in value with the
other two terms is small. We therefore picked point II in Figure 6 such that this difference was
large enough to distinctly show this scaling visually in the plots.

We close by strongly suggesting the use of the ECM as a null model for the analysis of degree-degree
dependencies, both for determining their impact on processes and for assessing their significance. Although for
the latter, values are often compared to averages using the rewiring model [?], we emphasize that
fixing the degrees imposes strong constraints on the possible simple graphs that can be generated.
Moreover, in real-life networks, not only the wiring but also the degrees of the nodes are the result of a
random process. Therefore, in a null model, it seems more natural to fix only general properties
of the network, such as degree distributions.